Analysis performed using antimicrobial susceptibility test data for all isolates from blood cultures collected at NUTH from Q1 2019 onwards.
Newcastle upon Tyne Hospitals NHS Foundation Trust
Friday, 30 June, 2023
Introduce R
Point out useful learning resources
Run through a working example of a project completed (in its entirety) using R
R is one of the most commonly used languages for data science, together with Python.
R is a powerful, free, open-source environment for data science and statistics, used in academia and industry, including by major corporations (e.g. Microsoft, Google, Facebook).
R benefits from a worldwide community that freely shares learning and resources, for example through GitHub.
Yes, to use R, we need to learn some code
No, it’s not rocket science.
Data has transformed our world in powerful ways and can help us make better decisions.
Almost every interaction with the health service leaves a digital trace - raw information that has phenomenal potential.
But raw data is not powerful on its own. It must be shaped, checked, curated and analysed. And then it must be communicated, and acted upon. This work requires people, with modern data skills, in teams, using platforms like R to do the heavy lifting and avoid needless duplication of effort.
The Goldacre report actively promotes the use of R in the NHS.
NUTH now actively supports the use of R at scale, and it can be installed on any work PC (simply call IT and ask to be added to the “SCCM-R” group).
The Health Foundation supports NHS-R, which delivers free-to-NHS-staff online training.
It’s free to register.
Courses are really popular and spaces are limited to about 20 per session. Sessions are scheduled once a month. To be notified when further dates are scheduled, please contact: nhs.rcommunity@nhs.net
NHS-R runs the premier data science conference in the NHS, along with regular skill-based webinars.
The #Rstats and #TidyTuesday hashtags are excellent
There are lots of good online courses
And lots of excellent freely available textbooks.
We know that projects can quickly lose momentum and get stuck
“Just one more subanalysis”
Think: Re-running reports after filtering the data by various substrata.
“Just a bit more data”
Think: Re-running reports when more data becomes available.
“Just (one more… last… final!) final report”
Think: Re-drafting the same report multiple times to get your paper accepted through peer review
Aim: Predict Antimicrobial Resistance (AMR) Rates for Blood Culture Isolates at NUTH, using R
Objectives:
Import blood culture data into R
Wrangle, visualise, and explore data using R
Analyse historical AMR rates, and model future AMR rates using R
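To give a feel for the first two objectives, here is a minimal base R sketch of importing and wrangling a blood culture extract. The data are synthetic and the column names (patient_id, collection_date, organism, antibiotic, result) are assumptions made for illustration, not the layout of the NUTH dataset.

```r
# Synthetic stand-in for a blood culture extract (hypothetical columns and values).
csv_text <- "patient_id,collection_date,organism,antibiotic,result
P1,2019-02-11,E. coli,gentamicin,S
P2,2019-07-30,E. coli,gentamicin,R
P3,2020-03-05,E. coli,gentamicin,S
P4,2020-09-19,E. coli,gentamicin,R
P5,2020-11-02,E. coli,gentamicin,R"

# Import: with a real file this would be read.csv("<path to extract>.csv").
bc <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Wrangle: parse the date, derive the calendar year, flag resistant results.
bc$collection_date <- as.Date(bc$collection_date)
bc$year <- as.integer(format(bc$collection_date, "%Y"))
bc$resistant <- bc$result == "R"

# Explore: proportion of resistant results per year.
rate_by_year <- aggregate(resistant ~ year, data = bc, FUN = mean)
print(rate_by_year)
```

The same three steps (import, wrangle, summarise) can be re-run unchanged whenever a new extract arrives, which is the main practical benefit of scripting them.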
The AMR package [1,2] is a free, open-source and independent package for R [3] that provides a standard for clean and reproducible analysis and prediction of Antimicrobial Resistance (AMR).
This package was used to determine ‘first isolates’, as per Hindler et al [4], for use in the final analysis; calculate and visualise AMR data; and predict future AMR rates using regression models.
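The idea behind 'first isolates' can be illustrated with a simplified base R sketch. This is not the AMR package's algorithm, only a rough approximation: it keeps the first isolate per patient and organism, and starts a new episode once a chosen number of days has passed since the start of the previous one. The function name flag_first_isolates, the column names and the 365-day window are assumptions made for this example, and the data are synthetic.

```r
# Synthetic isolates (hypothetical columns and values).
isolates <- data.frame(
  patient_id = c("P1", "P1", "P1", "P2", "P2"),
  organism   = c("E. coli", "E. coli", "E. coli", "S. aureus", "E. coli"),
  date       = as.Date(c("2019-01-10", "2019-03-01", "2020-06-15",
                         "2019-05-05", "2019-05-20")),
  stringsAsFactors = FALSE
)

# Flag the first isolate of each patient/organism pair within each episode.
# A new episode begins when at least `episode_days` have passed since the
# first isolate of the current episode.
flag_first_isolates <- function(df, episode_days = 365) {
  df <- df[order(df$patient_id, df$organism, df$date), ]
  df$first <- FALSE
  groups <- split(seq_len(nrow(df)),
                  list(df$patient_id, df$organism), drop = TRUE)
  for (idx in groups) {
    episode_start <- as.Date(NA)
    for (i in idx) {
      gap <- as.numeric(df$date[i] - episode_start, units = "days")
      if (is.na(episode_start) || gap >= episode_days) {
        df$first[i] <- TRUE
        episode_start <- df$date[i]
      }
    }
  }
  df
}

flagged <- flag_first_isolates(isolates)
first_only <- flagged[flagged$first, ]
print(first_only)
```

In this toy data, the second E. coli isolate from patient P1 is dropped as a repeat, while the third is kept because it falls in a new episode.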
The AMR package [1,2] includes functions which, based on a date column, calculate cases per year and use a regression model to predict antimicrobial resistance.
The resistance_predict() function creates a prediction model including standard errors (SE), which are returned as columns se_min and se_max.
Valid options for the statistical model (argument model) are: “binomial”, “poisson” and “linear”.
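To make the modelling step less of a black box, here is a base R sketch of the same general idea: fit a binomial regression of resistance on year, then predict future years with a band of one standard error either side. It does not call the AMR package and is not claimed to reproduce its internals. The yearly counts are synthetic, and computing the band on the logit scale is a choice made for this example; only the column names se_min and se_max are borrowed from the description above.

```r
# Synthetic yearly counts of resistant (R) and susceptible (S) first isolates.
yearly <- data.frame(
  year = 2019:2022,
  R    = c(12, 15, 19, 22),
  S    = c(88, 85, 81, 78)
)

# Binomial (logistic) regression: proportion resistant as a function of year.
fit <- glm(cbind(R, S) ~ year, family = binomial, data = yearly)

# Predict on the link (logit) scale with standard errors, then back-transform
# to proportions so that the band stays between 0 and 1.
future <- data.frame(year = 2019:2025)
pred <- predict(fit, newdata = future, type = "link", se.fit = TRUE)
future$value  <- plogis(pred$fit)
future$se_min <- plogis(pred$fit - pred$se.fit)
future$se_max <- plogis(pred$fit + pred$se.fit)
print(future)
```

Swapping family = binomial for family = poisson (on counts), or using lm() on yearly proportions, gives rough base R analogues of the “poisson” and “linear” options.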
In total, 11098 distinct positive blood cultures were collected from 6888 distinct patients, leading to isolation of 12272 organisms.
Taking into consideration ‘first isolates’ only, 8780 distinct positive blood cultures were collected from 6888 distinct patients, leading to isolation of 9648 organisms.
From this point onwards, this analysis concentrates only on ‘first isolates’ from blood cultures, to intelligently de-duplicate the data.
Age of patients with positive blood cultures
Daniel Weiand, Consultant medical microbiologist
Newcastle upon Tyne Hospitals NHS Foundation Trust
Email: dweiand@nhs.net
Twitter: @send2dan
NHS-R community blog: https://nhsrcommunity.com/author/daniel-weiand/
GitHub: send2dan